Two of the groups whose relationship represents one
of the most critical social problem areas are also the groups on whom
there is a large amount of data. These are American Negroes and
American Caucasians, or in today's terminology, blacks and whites.
Some data relevant to the design of programs of social action suggest
the following: The mean intellectual deficit of the black group is a
general one and not restricted to measures stressing the use of
standard English. The mean intellectual deficit occurs early and
changes very little between the first grade and high school
graduation. A third and closely-related set of data shows that black
schools are not substantially inferior to white schools as measured
by the usual economic and demographic characteristics. There is no
basis in the psychology of learning or of human abilities for an
assumption that environmental deficits can be quickly and easily
overcome given freedom of opportunity. If one starts with groups of
black and white women who are about equal in intellectual ability at
a level well below the national norm, the offspring of the blacks
will have a lower mean than the offspring of the whites. These data
are admittedly distressing. There appear to be no easy solutions. It
is quite clear, for example, that the preschool period is very
important and is also presently beyond the reach of the public
schools. Families and local communities must assume more
responsibility at this stage. (Author/JM)

I shall first summarize the principal conclusions of my ETS Testing
Conference paper of last fall and then proceed with some additional material
which helps to place the problem of test fairness, both for the individual and
for use in selection, in a more realistic perspective.

Accuracy in prediction or of inference can be analyzed into two traditional
components: sizes of random and of constant errors. Accuracy is a
quantitative concept; thus some means of estimating the two kinds of errors
is required. In order to estimate random error, the standard error of estimate
is recommended and is ordinarily sufficient. In order to estimate constant
error, the individual must be placed in some group and the group's criterion
performance compared with some standard group, holding test scores constant
in the two groups.
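
The two error components can be made concrete with a small sketch. It is
not taken from the paper: the data values and the function names (fit_line,
standard_error_of_estimate, constant_error) are hypothetical. It treats the
constant error as the mean signed residual of one group under the equation
fitted in a standard group, which is one simple reading of "holding test
scores constant."

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def standard_error_of_estimate(x, y, intercept, slope):
    """Random error: root of the residual sum of squares over n - 2."""
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))


def constant_error(x, y, intercept, slope):
    """Constant error: mean of (actual - predicted) for a group scored
    with another group's equation. Negative means overprediction."""
    return sum(yi - (intercept + slope * xi) for xi, yi in zip(x, y)) / len(x)


# Hypothetical test scores (x) and criterion scores (y), for illustration only.
ref_x = [1.0, 2.0, 3.0, 4.0, 5.0]
ref_y = [2.1, 2.9, 4.2, 4.8, 6.0]
other_x = [1.0, 2.0, 3.0, 4.0]
other_y = [1.6, 2.5, 3.4, 4.6]

a, b = fit_line(ref_x, ref_y)                            # standard-group equation
see = standard_error_of_estimate(ref_x, ref_y, a, b)     # random error
bias = constant_error(other_x, other_y, a, b)            # constant error
```

With these made-up numbers the second group's mean residual is negative, so
the standard group's equation overpredicts its criterion performance.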

Fairness in prediction or inference is directly related to the measures
of accuracy: minimization within feasible limits of random error and allowance
for constant error. The statistical estimation of fairness is incorporated in
the comparison of regression equations predicting criterion score from
predictor test. In equal ranges of talent the slopes of the regression lines
are inversely related to the amount of random error, while the intercepts, if
the lines have equal slopes, reflect the amount and direction of constant
error. If the two lines do not have equal slopes, the points of intersection
of the two lines become the matter of concern.
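
The comparison of two regression lines described above can be sketched as
follows. The (intercept, slope) pairs and the function names are hypothetical,
chosen only to show the two cases: parallel lines, where the gap is constant,
and non-parallel lines, where the crossing point matters.

```python
def intersection(line_a, line_b, tol=1e-12):
    """Test score at which two regression lines cross.

    Each line is an (intercept, slope) pair. Returns None when the slopes
    are numerically equal, i.e. the lines are parallel.
    """
    (a1, b1), (a2, b2) = line_a, line_b
    if abs(b1 - b2) < tol:
        return None
    # Solve a1 + b1 * x = a2 + b2 * x for x.
    return (a2 - a1) / (b1 - b2)


def prediction_gap(line_a, line_b, x):
    """Predicted criterion under line_a minus that under line_b at score x."""
    (a1, b1), (a2, b2) = line_a, line_b
    return (a1 + b1 * x) - (a2 + b2 * x)


# Hypothetical (intercept, slope) pairs, for illustration only.
group_1 = (1.0, 0.5)
group_2 = (0.5, 0.5)   # same slope as group_1: constant gap at every score
group_3 = (0.0, 0.6)   # different slope: gap changes sign at the crossing

parallel_gap = prediction_gap(group_1, group_2, x=4.0)
crossing = intersection(group_1, group_3)
```

For group_1 and group_2 the gap equals the intercept difference at every test
score. For group_1 and group_3 the sign of the gap depends on which side of
the crossing point an examinee's score falls.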

Unfortunately, the problem of assessing fairness in practice is not as
straightforward as the preceding sounds. Perfect fairness is impossible to
attain. Firstly, there are no zero differences in nature; tiny nonzero
differences can always be detected if enough observations are made. Secondly,
there are a huge number of possible groups to which a given examinee might
belong, if groups are to be defined as homogeneously as possible; or if only a
few major groups are defined, a given individual may belong to several. In the
former case all possible groups cannot be investigated. Thirdly, if by chance
two regression lines appear to be identical after making many observations, a
change in the reliability of the test will produce a difference in intercepts.
Fourthly, if each of the regressions of the criterion measure on two separate
tests appears to be identical in the two groups after making many
observations, the regressions involving a composite of the two predictors will
not be identical. (See Linn and Werts for a fuller discussion of these last
two points.)
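
The third point can be illustrated under the standard classical test theory
model. This is a sketch under stated assumptions, not the Linn and Werts
treatment itself. Assume criterion = a + b * true_score + noise and
observed = true_score + error, with the error independent of everything else
and the same within-group reliability in both groups. Then within a group the
regression of the criterion on the observed score has slope b * reliability
and intercept a + b * (1 - reliability) * group_mean. The function name and
all numbers are hypothetical.

```python
def observed_regression(a, b, group_mean, reliability):
    """Within-group regression of the criterion on a fallible test score.

    Assumes criterion = a + b * true_score + noise and
    observed = true_score + error (classical test theory).
    Returns (intercept, slope).
    """
    slope = b * reliability
    intercept = a + b * (1.0 - reliability) * group_mean
    return intercept, slope


# Hypothetical values: identical true-score regressions (a = 0, b = 1),
# group true-score means one standard deviation apart.
a, b = 0.0, 1.0
mean_high, mean_low = 0.0, -1.0

# Perfectly reliable test: the two observed lines coincide.
perfect_high = observed_regression(a, b, mean_high, reliability=1.0)
perfect_low = observed_regression(a, b, mean_low, reliability=1.0)

# Less reliable test: slopes stay equal, intercepts separate.
fallible_high = observed_regression(a, b, mean_high, reliability=0.8)
fallible_low = observed_regression(a, b, mean_low, reliability=0.8)
intercept_gap = fallible_high[0] - fallible_low[0]   # b * (1 - 0.8) * 1.0
```

Under these assumptions the intercept gap equals
b * (1 - reliability) * (difference in group means), so it vanishes only when
the test is perfectly reliable or the group means are equal.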

Under these circumstances there is only one feasible course of action,
since to insist on perfect fairness under all circumstances would make the use
of tests impossible. The probable amount of difference the use of a single
regression equation is likely to make must be determined. Social significance
of the difference rather than statistical significance is the more important
consideration once the null hypothesis has been rejected. Both the empirical
comparisons of regressions that have been made for several demographic groups
and ability theory serve as bases for this determination. The importance of
the size of differences expected can be determined by asking questions such
as the following: Are the sizes of the errors from the use of a single
regression equation likely to be smaller than the sampling errors of the
regression statistics computed in the smaller of the two groups? Does the
probable size and sign of the error in the use of a single equation affect
adversely the more or less aggrieved group? The smaller or larger of the two
groups?

The empirical base for use of a single equation with a few critical
groups is still all too small, but is now rapidly growing. Fortunately, also,
for purposes of generalizing, the data to date appear to be relatively
homogeneous in outcome. For blacks and whites a presently valid generalization
is that slopes of regression lines are highly similar and a small slope
intercept favoring the white group is found more frequently than the reverse.
Where slopes are not highly similar, and the black slope is lower, there still
tends to be a small degree of overprediction of black performance on the
criterion measure. One can summarize by saying that blacks at best do as well
on the criterion measure as predicted by their test scores.

There is also a modest amount of data available on sex differences in
regressions. Again slopes of lines are similar and there is frequently a
small degree of overprediction of male performance on the criterion measure.
A summary statement is that females do at least as well on the criterion
measure as they do on the test. A small amount of data is also available
for social class and area of residence groups. Results appear to be similar;
differences when they occur are generally small enough to be disregarded.
A possible exception is the male-female intercept difference in college
grades, but this could be reduced in size if controls on courses and majors
were imposed.

The theoretical basis for these findings of comparability of regressions,
and for an expectation that the present findings will be extended, is based
upon the well-known body of psychological knowledge labelled "transfer."
Tests are valid when both the test and the criterion measure sample the same
knowledge and skills. The greater the amount of overlap between test and
criterion, the higher is the correlation between the two measures. The causes
that produce individual differences on tests also produce individual
differences on the criterion measures. It also follows that the causes that
produce lower mean test scores for certain groups also produce lower mean
criterion scores for those groups. It is superficial, however, to dismiss this
explanation for validity as simply the effects of bias in both tests and
criteria. Reading and arithmetic skills are highly valued by our society and
are required for full participation. The society must continue to demand that
they be acquired.

It is not possible on the basis of present information to present a
detailed listing of these causes along with a contribution to variance of
each. Instead, only areas of causality can be listed: namely, genetic and
environmental. Within the latter set, all stages of development from prenatal
through birth to postnatal are involved. For the present, also, it is unwise
to concentrate on any one environmental influence as the critical one. By the
same line of reasoning, it is unsafe to rule out any particular influence as
unimportant.

Comparability of regressions provides no information about particular
areas of causation, but the finding does indicate that there has been no
appreciable compensation for the causes of reduced performance for the
lower-scoring group between the administration of the test and the
measurement of criterion performance. This finding is, then, in marked
disagreement with the expectations of many critics of objective tests. These
critics assumed, either explicitly or implicitly, that opportunity and
encouragement for lower-scoring groups were sufficient to produce substantial
underprediction of criterion performance from the reduced scores on the test.
This assumption is false; the problems are much more difficult and resistant
to solution than these critics assumed. Findings in child development and
human learning alone, completely without regard to possible genetic
influences, should have led to more caution. Wishful thinking solves neither
scientific nor social problems.

Wishful thinking, however, does have important effects: it results in
inadequate and superficial solutions to problems and in the long run harms
many individuals as well. It is highly desirable to face up to facts before
designing programs of social action.

Two of the groups whose relationship represents one of the most critical
social problem areas are also the groups on whom there is a large amount of
data. These are American Negroes and American Caucasians, or in today's
terminology, blacks and whites. (There is no reason to assume that either
group is representative of its race on a world-wide scale.) Some of the
additional data beyond the regression comparisons relevant to the design of
programs of social action concerning these groups are now summarized.

The mean intellectual deficit of the black group is a general one and
not restricted to measures stressing the use of standard English. Some of
the best data concerning this matter are from the military. In the early
1950s, for example, at a time when the upper 90% of young males were
considered qualified intellectually for military service, the mean of the race
differences on Air Force Classification Tests was about one standard
deviation of the white distribution in size. The difference on general
vocabulary was about at the average of all the differences, while differences
for rote learning and perceptual speed were smaller, and for mechanical
information and arithmetic reasoning higher than average. Uniformity in size
of the mean differences from test to test, however, was more striking than the
variability in the size of the differences. Army data from the late '60s are
highly comparable, indicating no important changes over a period of 15-20
years.

The mean intellectual deficit occurs early and changes very little
between the first grade and high school graduation. Obviously, in order to
make a statement of this sort, one must make assumptions about the units of
measurement used over this time span. The oft-quoted statement that the mean
difference in grade or age equivalent units increases during the public school
period is true, but these units become smaller as development proceeds. The
difference in means remains approximately constant in standard score units or
in intelligence or achievement quotient units. It would be a remarkable
educational achievement, for which there is no precedent in any group, if the
schools could maintain the black group at the same grade placement
differential throughout the grades.

A third and closely-related set of data shows that black schools are not
substantially inferior to white schools as measured by the usual economic and
demographic characteristics. The only substantial school differences, as
distinguished from deficits, also involve race, i.e., the race of teachers,
administrators, and parents. Such differences can hardly be translated
directly into evidence for the inferiority of the schools. The small
deficiencies that do exist, such as per capita dollar expenditures, libraries,
laboratories, etc., largely disappear when section of the country is
controlled. Southern schools for both races are inferior in terms of these
standards, and southern schools enroll a higher proportion of the nation's
blacks than of its whites.

Lloyd G. Humphreys

Fourth, and also closely related to the foregoing, there is no basis in
the psychology of learning or of human abilities for an assumption that
environmental deficits can be quickly and easily overcome given freedom of
opportunity. This generalization depends in part, but only in part, on the
comparability of regression phenomenon described earlier. It is especially
important for the adult or near adult. Both abilities and inabilities can
be acquired. No matter how stimulating a new educational or occupational
environment may be for a high-school graduate, the effects of 18 years and 9
months of a disadvantaged environment, including 18 years of inadequate
learning, cannot be compensated for overnight.

A fifth type of data is not as complete as one would like, but some of
the trends are quite certain. If one starts with groups of black and white
women who are about equal in intellectual ability at a level below the
national norm, the offspring of the blacks will have a lower mean than the
offspring of the whites. While the children of both groups will regress
upwards towards their respective group means, the overall black mean is one
standard deviation below the white mean. Therefore, the black children will
be more like their mothers than the white children. There is, accordingly,
a stronger tendency for a culture of poverty and the concomitant intellectual
deprivation to be passed on from one generation to another within black
families. At the other end of the scale of intelligence, though this
conclusion is much less certain than the preceding one, middle-class black and
white parents of the same level of ability will probably have children of
unequal average levels of ability. (Regression in the black group, however,
may not be symmetrical around the mean as it is in the white group.) Black
children, at this end of the distribution, may regress downward more than
white children although substantial amounts of regression are expected in both
groups.
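
The arithmetic of regression toward a group mean can be sketched as follows.
This is an illustration of the statistical mechanism only. It assumes a
simple linear model with equal variances in the two generations. The
parent-offspring correlation of 0.5, the score scale with a standard deviation
of 15, and the group means are all hypothetical values not taken from the
paper.

```python
def expected_offspring(parent_score, group_mean, r):
    """Expected offspring score under linear regression toward the mean of
    the group the parent is drawn from (equal variances assumed)."""
    return group_mean + r * (parent_score - group_mean)


# Hypothetical values: two groups whose means differ by one standard
# deviation (15 points), and a parent score below both means.
r = 0.5
mean_a, mean_b = 100.0, 85.0
parent = 80.0

child_a = expected_offspring(parent, mean_a, r)   # 100 + 0.5 * (80 - 100) = 90.0
child_b = expected_offspring(parent, mean_b, r)   # 85 + 0.5 * (80 - 85) = 82.5

# Distance of each expected offspring score from the parent score.
shift_a = child_a - parent
shift_b = child_b - parent
```

Both expected scores move upward from the parent score, but the shift is
smaller in the group whose mean lies closer to the parent score, which is the
sense in which those children are "more like their mothers."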

It is generally true that data involving relationships of intellectual
tests are not parallel in all disadvantaged groups and that a seemingly
desirable social action in one group may not have desirable effects in other
groups. For example, as Jencks et al. point out, if this society decided to
make access to higher education depend entirely on measured aptitude so-called,
or academic achievement, the children of working-class whites would gain
appreciably, black children would lose, and the children of upper middle-class
whites would also lose a good part of their present privileged status.
Parallel results would follow if entry into occupations were also based
entirely on test scores. On the other hand, if entry into higher education and
occupations were based entirely on aspirations, black children would be
helped, the children of working-class whites would be hurt, and the children
of upper middle-class whites would retain their privileged position. These
particular consequences define a dilemma and produce conflict in a Jeffersonian
or Jacksonian democrat.

These data are admittedly distressing. They indicate that the
intellectual deficits of the black group are broad, not narrow; appear early in
development, not in the public school period; tend to persist, especially
in the adult, not quickly disappear with freedom of opportunity; and tend
to be passed on from one generation to another primarily by a stratum in the
black group, not by the race per se. There appear to be no easy solutions,
while the characteristics of the expensive and time-consuming solutions that
will be required can only be guessed at in the absence of good data. It is
quite clear that the preschool period is very important and is also presently
beyond the reach of the public schools. Families and local communities must
assume more responsibility at this stage.

The public schools must also do more. Holding their own is not
sufficient, but no one knows with certainty how the schools must change to
accomplish more. Furthermore, the most promising techniques for accomplishing
more, though they certainly require tryout on a broad scale over a prolonged
period of time, are very generally ignored by the schools and most teacher
training institutions. It is unnecessary, as well as unfair, but the term
"Skinnerian" has become the kiss of death in most of these institutions.
Their faculties refuse to look at data.

Simply qualifying more blacks for higher education and for entry into
intellectually-demanding occupations by adopting some other definition of
fairness in selection than the fairness for individuals herein defined is a
very superficial response to the difficult problems that have been described.
There is no reason to believe that adopting a definition that will qualify
more blacks will produce equality in attainment, even over a period of several
years, for those so qualified. At best it represents a temporary expedient
acceptable only for reasons of the social emergency. It does allow a little
time to deal with some of the more fundamental aspects of the problem; this
time should be used, however, and not wasted. In this light I endorse the
Darlington definition of fairness in selection which awards a criterion bonus,
which can and should vary in size as a function of the criterion, to
underprivileged groups. This is done for reasons of social policy and has
little to do with psychometric or even psychological considerations.




